-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async jupyter client #10
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together! I need to do some more thorough testing, but there are a few things we need to figure out in the meanwhile.
A simple one is that #12 (and other PRs touching execute_cell) will conflict with this PR since it touches most public methods so it'll probably become stale as smaller PRs merge in. If we start something experimental and want more incremental PRs it might make sense for us to work out of a branch.
A larger concern that I'm not sure how to completely bridge is that we need to support python 3.5+ (probably can cut to 3.6+ later in the year). I haven't looked into what the best approach is to support async across such versions when one needs to support synchronous calls back in pre-async python verisons. Perhaps we could build a wheel that has different files imported depending on the python version, though that might be tricky. Another option is to release two modules, nbclient and nbclient-async independently. Open to suggestions on this topic.
nbclient/execute.py
Outdated
loop = get_loop() | ||
return loop.run_until_complete(self.async_run_cell(cell, cell_index, store_history)) | ||
|
||
async def poll_output_msg(self, parent_msg_id, cell, cell_index, timeout=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will need to do some more manual testing to ensure we correctly timeout, pass execution context, and handle errors cleanly. Some of those patterns are tested in the synchronous path with some assumptions that might not hold for the async version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to explain a bit more how I implemented the asynchronous functionality, there are now two tasks that are run in parallel:
_poll_for_reply
: asynchronously awaits a reply on the shell channel (possibly with timeout).poll_output_msg
: asynchronously awaits a message on the IOPub channel for cell execution. We first await as long as no reply is received on the shell channel, and when a reply is received on the shell channel we cancel this task and launch it again with a timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that helps!
nbclient/tests/test_execute.py
Outdated
@@ -270,6 +295,55 @@ def test_many_parallel_notebooks(capfd): | |||
assert captured.err == "" | |||
|
|||
|
|||
def test_async_parallel_notebooks(capfd, tmpdir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Glad these tests are here to validate parallel execution
tox.ini
Outdated
deps = .[dev] | ||
commands = pytest -vv --maxfail=2 --cov=nbclient -W always {posargs} | ||
commands = | ||
/bin/bash -c 'python -m pip install -e git+https://[email protected]/davidbrochart/jupyter_client.git@async_client#egg=jupyter_client' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can put that in the requirements file to test as a git path without needing to toss it in the commands. That also makes the package install cleaner for local development and indicates the branch is dependent on an unreleased code dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to put it in the requirements file but never managed to make it work, that's why it ended up here (I had an error about pip not being able to parse it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all good, there's a way to make it happy but we can instead just focus on getting async support merged and released in jupyter_client as a pre-req for this PR to be released.
Thanks for reviewing @MSeal. There are a couple of issues related to compatibility of async with older Python versions:
That being said, I think we should work on fixing these issues and not have a |
That seems a bit overkill to me! |
Trigger CI. |
@MSeal I managed to get Python compatibility from 3.5 to 3.8, but it looks like the CI dies randomly for |
Trigger CI. |
I've seen python 3.5 fail randomly in nbconvert and now in nbclient where the kernel dies unexpectantly due to port conflicts. I believe that's related to ipython no longer releasing 3.5 versions (
is new to me -- that looks like something different is happening. |
nbclient/execute.py
Outdated
) | ||
exec_reply = await self._poll_for_reply(parent_msg_id, cell, exec_timeout) | ||
if not task_poll_output_msg.done(): | ||
task_poll_output_msg.cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never knew about cancel, can this mean a message can get lost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, because messages are queued anyway in zmq. And we create a new task polling for messages just after that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But can it cancel, when it's off the zmq queue, say, it's in the json parser. How does it cancel like Ctrl-c.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can't do anything between two await
, so it depends on the implementation. For instance, in zmq.asyncio.Socket.recv_multipart
we might await zmq.asyncio.Socket.recv
several times and so cancelling in the middle of it can lead to losing a message, you're right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way that I tried out was to leave the first poll_output_msg
task running (with no timeout) and launch another one with a timeout when _poll_for_reply
is done (and if the first poll_output_msg
task is not done), and then asyncio.wait(both_tasks, return_when=asyncio.FIRST_COMPLETED)
, but it doesn't handle exceptions well (there is a return_when=asyncio.FIRST_EXCEPTION
for that).
I can't think of a better solution right now.
@MSeal there is an accurate description of that issue in jupyter/jupyter_client#487 by @JohanMabille. Basically the whole handshake model of the kernel protocol is an antipattern. The client should not tell the kernel which ports to use. It should only open one socket for the kernel to communicate back the available ports for itself... |
@maartenbreddels I reworked the polling on the shell and iopub channels. Now the polling on iopub channel task is run only once (with no timeout), and when we have an exec reply we |
Sounds brilliant, I'll review next week!
(from mobile phone)
…On Sat, 8 Feb 2020, 13:50 David Brochart, ***@***.***> wrote:
@maartenbreddels <https://github.com/maartenbreddels> I reworked the
polling on the shell and iopub channels. Now the polling on iopub channel
task is run only once (with no timeout), and when we have an exec reply we
wait_for that task with a timeout, which will cancel the task after the
timeout. That way there is no task rescheduling and no risk to loose a
message.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#10?email_source=notifications&email_token=AANPEPM2JSE4K5C5EAYJVL3RB2TCDA5CNFSM4KOZCYWKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELFREZI#issuecomment-583733861>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AANPEPMG3BQMHVXCYII7SFTRB2TCDANCNFSM4KOZCYWA>
.
|
@MSeal jupyter_client now has an async API (jupyter/jupyter_client#506 has been merged). How would you like to proceed? I guess we should make a release of jupyter_client before merging this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for getting this all in place and updated!
This pull request has been mentioned on Jupyter Community Forum. There might be relevant details there: https://discourse.jupyter.org/t/welcome-new-jupyter-client-nbclient-maintainer-david-brochart/3467/1 |
This is a follow-up of #6. It has to be used with jupyter/jupyter_client#506.